Paper link: [here](https://ieeexplore-ieee-org.ezproxy.library.wisc.edu/document/8916468)

This paper proposes a software-level algorithm for approximate matrix computation by sampling. The main takeaway is that for A x B = C, by sampling the norms of the columns of A and the rows of B and discarding the columns and rows with the smaller norms, we can compute an approximation of C while reducing the amount of matrix computation.
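To make the idea concrete, here is a minimal sketch of norm-based pruning for A x B, not the paper's exact algorithm. It assumes each column of A is paired with the matching row of B, that pairs are scored by the product of their Euclidean norms, and that only the `keep` highest-scoring pairs are accumulated. The function name `approx_matmul` and the `keep` parameter are my own, chosen for illustration.

```python
import math


def approx_matmul(A, B, keep):
    """Approximate A x B using only the `keep` column/row pairs with the largest norms.

    A is a list of rows (m x n), B is a list of rows (n x p).
    The scoring rule (product of norms) is an assumption, not taken from the paper.
    """
    n = len(B)
    if n == 0 or any(len(row) != n for row in A):
        raise ValueError("inner dimensions of A and B must match")
    if keep < 0:
        raise ValueError("keep must be non-negative")

    # Score inner index i by ||column i of A|| * ||row i of B||.
    scores = [
        math.sqrt(sum(row[i] ** 2 for row in A)) * math.sqrt(sum(x ** 2 for x in B[i]))
        for i in range(n)
    ]

    # Keep the indices with the largest scores; the rest are discarded.
    kept = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]

    # Accumulate C as a sum of outer products over the kept indices only.
    m, p = len(A), len(B[0])
    C = [[0.0] * p for _ in range(m)]
    for i in kept:
        for r in range(m):
            a = A[r][i]
            for c in range(p):
                C[r][c] += a * B[i][c]
    return C
```

With `keep` equal to the inner dimension this reduces to an ordinary matrix product; smaller values skip whole outer products, which is where the savings come from. The outer-product form is used here because each kept pair is an independent unit of work, which seems like a natural fit for a hardware implementation.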

The paper only evaluates the algorithm in software simulation, and I would like to implement it at the hardware level (via RTL simulation or an FPGA).

Paper link: [here](https://ieeexplore-ieee-org.ezproxy.library.wisc.edu/document/7783746)

This is the second paper review, so I will skip the introduction. The paper mentions that the parameters of the access mapping function are programmer-defined. I would like to implement a DSP-based co-processor for the Bunker Cache that calculates these parameters via a Discrete Fourier Transform (DFT) during data I/O from storage to memory, so that the cache can be implemented without ISA support.

I have been working on a RISC-V processor core for a while now, and I would like to implement a matrix multiplication co-processor for it, possibly for graphics applications. I’m not sure whether this would be a valid project for the course.